Barcoding and species delimitation of Iranian freshwater crabs of the Potamidae family (Decapoda: Brachyura)

Freshwater ecosystems face multiple threats in modern times, including water extraction for human consumption, industry and agriculture, water contamination and habitat destruction. At the same time, the biodiversity of these ecosystems is often poorly studied, especially in arid countries such as Iran. In this work, we study ecologically important members of the Iranian freshwater fauna, the freshwater crab species of the genus Potamon. We barcoded the different populations occurring in the country and delimited the species to allow for a better understanding of their distribution and taxonomy. We also evaluated the taxonomic status of Potamon species in Iran using genetic data. In addition, we created the first barcoding reference for Iranian freshwater crabs, which is an important resource for future environmental and conservation studies.

www.nature.com/scientificreports/ molecular techniques, one could easily identify different populations, directly or indirectly (i.e., environmental DNA approaches). Molecular approaches present their own set of challenges, one of them being the availability of easily accessible reference databases. In this study, we sampled different populations of freshwater crabs of the Potamon genus to (i) identify their taxonomic placement, (ii) evaluate the taxonomic validity of the recognized species using molecular data, and (iii) create a barcode reference for different species inhabiting Iranian freshwater ecosystems.

Results
The final sampling resulted in 110 individuals from six species (Table 1), which covers all the major freshwater bodies of Iran inhabited by this genus (Fig. 1). The final dataset consisted of 923 positions (mean sequence length of 798 bp), of which 205 were parsimony informative, 36 were singletons and 682 were invariable. ModelFinder analysis did not merge any partitions; therefore, each codon position formed an independent partition. The resulting phylogenetic tree (Fig. 2) does not have enough resolution to recover the phylogenetic relationships within the genus. However, each species forms a relatively clear cluster, which helps identify the species boundaries. All species were recovered as highly supported monophyletic groups, with the exception of Potamon persicum. The only two sequences representing P. bilobatum in our study clustered inside the P. ibericum clade, making them indistinguishable from the latter species. Sequences identified as P. gedrosianum were placed with high support as the sister group to all other species of Potamon from Iran. P. transcaspicum was represented by only a single sequence. Despite having a wide distribution and being overrepresented in this study, P. ibericum does not show a clear population structure. The within-species genetic distance was highest in P. ibericum, at 2% (Table 2). The shortest genetic distance between sister species is 3%, between P. persicum and P. ilam.
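The three site categories reported above can be illustrated with a minimal sketch. It assumes the usual definitions (a parsimony-informative column has at least two character states that each occur in at least two sequences; a singleton column is variable but not parsimony informative) and ignores gaps and ambiguous bases. The function name and the toy alignment are hypothetical and are not taken from the study.

```python
from collections import Counter


def classify_sites(alignment):
    """Count invariable, singleton and parsimony-informative columns.

    `alignment` is a list of equal-length, upper-case aligned sequences.
    Gaps and ambiguous bases are ignored (an assumption of this sketch).
    """
    counts = {"invariable": 0, "singleton": 0, "parsimony_informative": 0}
    for column in zip(*alignment):
        # Tally only unambiguous nucleotides in this column.
        states = Counter(base for base in column if base in "ACGT")
        if len(states) <= 1:
            counts["invariable"] += 1
        elif sum(1 for n in states.values() if n >= 2) >= 2:
            # At least two states, each seen in two or more sequences.
            counts["parsimony_informative"] += 1
        else:
            counts["singleton"] += 1
    return counts


# Hypothetical toy alignment: four sequences, five columns.
toy = ["ACGTA", "ACGTA", "ATGAA", "ATGAC"]
print(classify_sites(toy))
```

The three counts always sum to the alignment length, which is the same consistency that holds for the reported figures (205 + 36 + 682 = 923).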

Discussion
At present, the Iranian members of the genus Potamon are represented by 9 species in the literature: P. bilobatum, P. elbursi, P. gedrosianum, P. ibericum, P. ilam, P. persicum, P. ruttneri, P. strouhali and P. transcaspicum. Our results support the synonymy of P. bilobatum with P. ibericum, and we suggest that the taxonomic status of P. bilobatum should be studied in more detail. As seen in Fig. 2, the two nominal species are indistinguishable from each other in the tree. This result relies on sequences of samples identified in other studies 12, in which the paratypes of P. bilobatum were sequenced. Although the COI barcode region clearly separated the other species studied here, a single marker might not be sufficient to confirm the taxonomy of the genus. Therefore, we believe more specific studies are needed to resolve the taxonomic status of P. bilobatum. Our analyses divide the samples identified as P. persicum into two independent lineages, which could be caused by the lack of resolution and support in that part of the tree. This could be improved with a larger sample size for the populations of this species. On the other hand, the average genetic distance within all samples identified as P. persicum was comparable to the average genetic distance within P. ibericum samples. This supports the idea that the structure observed in the tree for P. persicum corresponds to the population structure observable in widespread species and is probably not due to a speciation event. The gap between intraspecific and interspecific genetic distances in Iranian members of the genus Potamon appears to lie between 2 and 3%.
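The comparison of within-species and between-species distances discussed above can be sketched as follows. This is an illustration only: it uses uncorrected p-distances (proportion of differing sites) averaged over sequence pairs, skips gaps and ambiguous bases, and runs on invented toy sequences. The function names and species labels are hypothetical, and it is not the software used in the study.

```python
from itertools import combinations, product
from statistics import mean


def p_distance(a, b):
    """Proportion of differing sites, skipping gaps and ambiguous bases."""
    pairs = [(x, y) for x, y in zip(a, b) if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def barcode_gap(groups):
    """Return (largest mean within-group, smallest mean between-group distance).

    `groups` maps a species label to a list of aligned sequences. Groups with
    a single sequence have no within-group distance and are skipped there.
    """
    within = {
        sp: mean(p_distance(a, b) for a, b in combinations(seqs, 2))
        for sp, seqs in groups.items()
        if len(seqs) > 1
    }
    between = {
        (sp1, sp2): mean(
            p_distance(a, b) for a, b in product(groups[sp1], groups[sp2])
        )
        for sp1, sp2 in combinations(groups, 2)
    }
    return max(within.values()), min(between.values())


# Hypothetical toy data, not real COI sequences.
toy_groups = {
    "species_a": ["AAAAAAAAAA", "AAAAAAAAAC"],
    "species_b": ["CCCCCAAAAA", "CCCCCAAAAA"],
    "species_c": ["GGGGGGGGGG"],
}
max_within, min_between = barcode_gap(toy_groups)
print(f"max within: {max_within:.2f}, min between: {min_between:.2f}")
```

A barcode gap exists when the first value is smaller than the second; a delimitation threshold would then be chosen between the two.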

Conclusions
In this study, we present the first barcode reference for different populations of potamid crabs inhabiting Iranian freshwater bodies. We evaluated the taxonomic status of the described species using molecular data, which showed rather high genetic diversity within species. This is a first step towards improving species identification in future studies that use molecular techniques. Our results offer an important molecular resource for environmental and conservation studies. We believe these results are especially timely, as eDNA approaches are becoming an important part of conservation and biodiversity studies, and these approaches rely strongly on molecular references. Proper species identification is the basis for future studies on the ecology and conservation of these species, which are highly susceptible to climate change.

Methods
Taxon sampling. A total of 35 specimens from 19 localities were sampled in this project, covering the main distribution range of the genus in Iran. In addition, all available barcode sequences from Iran in GenBank, a total of 75, were downloaded and included in the study (Table 1 and Supplementary Material). Eleven other available COI sequences from Iran (accession numbers LN833869-LN833879) were omitted from the study, as they corresponded mainly to the second half of the COI gene and overlapped only minimally with the barcode region and the rest of our dataset. These sequences were identified as P. elbursi, which is represented in our study by other, better-suited sequences. To root the phylogenetic tree, the barcode sequences of two other potamid species, Socotra pseudocardisoma (AY803585) and Johara tiomanensis (AB290644), were downloaded from GenBank.
We fixed the sampled specimens directly in absolute or 95% ethanol by injecting ethanol into their bodies and submerging them in jars. The ethanol in the jars was changed multiple times during the first days, because it becomes diluted as it absorbs water from the samples while dehydrating and thereby preserving them. We observed that ethanol injection and multiple changes are crucial to obtain samples with well-preserved DNA, as other specimens that were not treated following this procedure did not amplify successfully in the majority of cases. The samples were deposited in the collections of the National Museum of Natural Sciences of Madrid (MNCN-CSIC).
DNA extraction and sequencing. Genomic DNA was extracted from a small sample (less than 2 mm in size) of muscle tissue of an ambulatory leg using the DNeasy® Blood & Tissue Kit (QIAGEN, Hilden, Germany). DNA purification was carried out using BioSprint 15 and one 5-tube strip per sample. The DNA was eluted in 200 μl of AE buffer and transferred into a 1.5 ml microtube for long-term storage. The barcode region of the cytochrome c oxidase subunit I (COI) gene was amplified using LCO1-1490/HCO1-2198 forward and reverse primers 13. Amplification was carried out in a total volume of 12 μl per reaction (1-2 μl template DNA, 1 μl of each primer, 2.75-1.75 μl Milli-Q H2O, adjusted to the template volume, and 6.25 μl of DreamTaq Green PCR Master Mix). After confirmation of successful amplification by electrophoresis, PCR products were purified using Exo-SAP-IT® and sequenced by a commercial company (Macrogen, Seoul, South Korea) with the same forward and reverse primers. The obtained sequences were quality checked, trimmed and assembled in Geneious (Geneious® 10.2.6; Biomatters, http://www.geneious.com) 14. They were aligned with MAFFT 15,16 as implemented in Geneious, using the auto algorithm option.
Each alignment was trimmed, manually adjusted, and visually verified to maximize positional homology, taking into account the genetic codes and the translation frames of the protein-coding gene. All the sequences have been deposited in GenBank (Table 1).
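One way to check a translation frame is to look for internal stop codons, as in the minimal sketch below. It assumes NCBI translation table 5 (invertebrate mitochondrial), in which TAA and TAG are the stop codons; the text does not state which table was applied, so this choice, the function name and the toy sequence are all assumptions made for illustration.

```python
# Stop codons under NCBI translation table 5 (invertebrate mitochondrial).
# Assumption of this sketch: the study does not name the table it used.
STOP_CODONS = {"TAA", "TAG"}


def frames_without_stops(seq):
    """Return the reading-frame offsets (0, 1, 2) with no stop codon.

    Codons containing gaps or ambiguous bases never match a stop codon,
    so they are treated as non-stop codons here.
    """
    seq = seq.upper()
    clean = []
    for offset in range(3):
        # Complete codons only; a trailing partial codon is ignored.
        codons = [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]
        if not any(codon in STOP_CODONS for codon in codons):
            clean.append(offset)
    return clean


# Hypothetical toy sequence: frame 0 reads ATG TAA GCC and hits a stop.
print(frames_without_stops("ATGTAAGCC"))
```

For a real protein-coding barcode fragment, one would expect exactly one frame to be free of stop codons; a sequence with none may contain a frameshift or sequencing error.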
Alignment, phylogenetic inference and species delimitation. The final dataset was aligned using MAFFT as implemented in Geneious and screened for sequencing errors. Such errors were found in data downloaded from GenBank and were corrected using IUPAC degenerate nucleotide codes (e.g., gaps resulting in frameshifts were replaced with Ns where possible). The maximum likelihood approach was used to construct a phylogenetic tree in IQ-Tree v2.1.2 17. The best partitioning scheme and substitution model were found using ModelFinder 18 as implemented in IQ-Tree (-m MFP+MERGE). For the tree reconstruction, 500 nonparametric bootstraps 19 were used to evaluate the nodal support (-b 500). To delimit species within our dataset, we used bPTP 20. For the bPTP approach, the phylogenetic tree was analysed using the online portal (https://species.h-its.org/). The "rooted tree" and "delete outgroup" options were selected, and the number of MCMC iterations was increased to 5 × 10^5. All other parameters were left at their default values. Alignment statistics and
Table 1. The colours used for each species correspond to the same colours used in Fig. 1.
Table 2. Estimates of average evolutionary divergence over sequence pairs between and within groups. The number of base differences per site from averaging over all sequence pairs within each group is shown in the diagonal and is marked in bold. "n/c" is shown in one case because only one sequence was available for it.
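The maximum likelihood settings given in the Methods (-m MFP+MERGE and -b 500) could be assembled into a command along the following lines. This is a hedged sketch, not the authors' actual invocation: the file names, the `iqtree2` executable name, the use of the `-p` partition option and the assumption that the alignment begins at the first codon position are all illustrative choices that the text does not specify.

```python
import shlex


def codon_partition_nexus(length):
    """Build a NEXUS sets block with one charset per codon position.

    Assumes the alignment starts at codon position 1 (not stated in the text).
    """
    lines = ["#nexus", "begin sets;"]
    for pos in (1, 2, 3):
        # "start-end\3" selects every third column beginning at `start`.
        lines.append(f"  charset pos{pos} = {pos}-{length}\\3;")
    lines.append("end;")
    return "\n".join(lines) + "\n"


def iqtree_command(alignment, partitions, executable="iqtree2"):
    """Return the argument list for a partitioned ML run with bootstraps."""
    return [
        executable,
        "-s", alignment,    # input alignment (hypothetical file name)
        "-p", partitions,   # partition file (option choice is an assumption)
        "-m", "MFP+MERGE",  # ModelFinder with partition merging, as in the text
        "-b", "500",        # 500 nonparametric bootstrap replicates, as in the text
    ]


nexus = codon_partition_nexus(923)
cmd = iqtree_command("coi_alignment.fasta", "codon_partitions.nex")
print(nexus)
print(shlex.join(cmd))
```

The argument list can be passed to `subprocess.run` once the partition text has been written to the named file; the sketch only prints both so that it runs without IQ-Tree installed.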